 complete case analysis




Review for NeurIPS paper: A Robust Functional EM Algorithm for Incomplete Panel Count Data

Neural Information Processing Systems

Weaknesses: - The MCAR assumption is difficult to justify in practice. This is fine; however, could the authors clarify the following points about their method in the context of MCAR missingness? By definition, MCAR implies that one can simply ignore any rows of data containing missingness, and that restricting the analysis to the so-called "complete cases" will still yield unbiased estimates of the parameter of interest. In light of this, and given that the bounds on \epsilon imply there will always be complete cases in the data as n \to \infty (if this were not true, the parameters of interest would not be identifiable), what is the advantage of the proposed EM algorithm over simply doing a complete case analysis with some of the older tools cited in the paper that can be run on complete data? I apologize if I missed this, but there does not seem to be a baseline comparison to such a complete case analysis, or to the alternative of directly maximizing the observed data likelihood by integrating according to patterns of missingness.
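The reviewer's point about MCAR can be illustrated with a small simulation. The sketch below is not taken from the reviewed paper: the Gaussian outcome, the 30% missingness rate, and the function name `simulate_mcar_complete_case` are all assumptions chosen for illustration. It only shows that, when rows are dropped independently of the data, the complete-case mean stays close to the full-sample mean.

```python
import random
import statistics


def simulate_mcar_complete_case(n=20000, p_missing=0.3, seed=0):
    """Compare the full-sample mean with the complete-case mean under MCAR.

    Hypothetical setup: the outcome y is Gaussian with true mean 2.0, and
    each row is marked missing with the same probability p_missing,
    independently of y (this independence is what makes it MCAR).
    """
    rng = random.Random(seed)

    # Outcome of interest; the target parameter is its mean (2.0).
    y = [rng.gauss(2.0, 1.0) for _ in range(n)]

    # MCAR mechanism: the keep/drop decision never looks at y.
    observed = [rng.random() >= p_missing for _ in range(n)]

    # Complete case analysis: discard every row flagged as missing.
    complete_cases = [yi for yi, keep in zip(y, observed) if keep]

    full_mean = statistics.fmean(y)
    cc_mean = statistics.fmean(complete_cases)
    return full_mean, cc_mean, len(complete_cases)


full_mean, cc_mean, n_cc = simulate_mcar_complete_case()
print(f"full-sample mean:    {full_mean:.3f}")
print(f"complete-case mean:  {cc_mean:.3f}")
print(f"complete cases kept: {n_cc}")
```

The complete-case estimate uses fewer rows, so it is noisier than an estimate based on all rows even though it is unbiased. A baseline of this kind is what the reviewer asks the authors to compare against.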


Bounds and Sensitivity Analysis of the Causal Effect Under Outcome-Independent MNAR Confounding

Peña, Jose M.

arXiv.org Machine Learning

We report assumption-free bounds for any contrast between the probabilities of the potential outcome under exposure and non-exposure when the confounders are missing not at random. We assume that the missingness mechanism is outcome-independent. We also report a sensitivity analysis method to complement our bounds.


Full Information Linked ICA: addressing missing data problem in multimodal fusion

Li, Ruiyang, Bowman, F. DuBois, Lee, Seonjoo

arXiv.org Machine Learning

Recent advances in multimodal imaging acquisition techniques have allowed us to measure different aspects of brain structure and function. Multimodal fusion, such as linked independent component analysis (LICA), is popularly used to integrate complementary information. However, it has suffered from missing data, commonly occurring in neuroimaging data. Therefore, in this paper, we propose a Full Information LICA algorithm (FI-LICA) to handle the missing data problem during multimodal fusion under the LICA framework. Built upon complete cases, our method employs the principle of full information and utilizes all available information to recover the missing latent information. Our simulation experiments showed the ideal performance of FI-LICA compared to current practices. Further, we applied FI-LICA to multimodal data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study, showcasing better performance in classifying current diagnosis and in predicting the AD transition of participants with mild cognitive impairment (MCI), thereby highlighting the practical utility of our proposed method.


Imputation of missing values in multi-view data

van Loon, Wouter, Fokkema, Marjolein, de Rooij, Mark

arXiv.org Artificial Intelligence

Data for which a set of objects is described by multiple distinct feature sets (called views) is known as multi-view data. When missing values occur in multi-view data, all features in a view are likely to be missing simultaneously. This leads to very large quantities of missing data which, especially when combined with high-dimensionality, makes the application of conditional imputation methods computationally infeasible. We introduce a new imputation method based on the existing stacked penalized logistic regression (StaPLR) algorithm for multi-view learning. It performs imputation in a dimension-reduced space to address computational challenges inherent to the multi-view context. We compare the performance of the new imputation method with several existing imputation algorithms in simulated data sets. The results show that the new imputation method leads to competitive results at a much lower computational cost, and makes the use of advanced imputation algorithms such as missForest and predictive mean matching possible in settings where they would otherwise be computationally infeasible.